ABSTRACT

The Ultraviolet (UV) continuum slope β, typically observed at z ~ 7 in Hubble Space
Telescope (HST) WFC3/IR bands via the J−H colour, is a useful indicator of the age,
metallicity, and dust content of high-redshift stellar populations. Recent studies have
shown that the redward evolution of β with cosmic time from redshift 7 to 4 can be
largely explained by a build-up of dust. However, initial claims that faint z ~ 7 galaxies
in the Hubble Ultra Deep Field WFC3/IR imaging (HUDF09) were blue enough to
require stellar populations of zero reddening, low metallicity and young ages, hitherto
unseen in star-forming galaxies, have since been refuted and revised. Here we revisit the
question of how best to measure the UV slope of z ~ 7 galaxies through source recovery
simulations, within the context of present and future ultra-deep imaging from HST. We
consider how source detection, selection and colour measurement have each biased the
measurement of β in previous studies. After finding a robust method for measuring β in
the simulations (via a power-law fit to all the available photometry), we remeasure the
UV slopes of a sample of previously published low-luminosity z ≈ 7 galaxy candidates.
The mean UV slope of faint galaxies in this sample appears consistent with an intrinsic
distribution of normal star-forming galaxies with β ~ −2, although properly decoding
the underlying distribution will require further imaging from the ongoing HUDF12
programme. We therefore go on to consider strategies for obtaining better constraints
on the underlying distribution of UV slopes at z ~ 7 from these new data, which will
benefit particularly from the addition of imaging in a second J-band filter: F140W.
We find that a precise and unbiased measurement of β should then be possible.

Key words: galaxies: high-redshift - galaxies: evolution - galaxies: formation -
galaxies: starburst - cosmology: early Universe



1 INTRODUCTION 

Recent observations with WFC3/IR on board the Hubble 
Space Telescope (HST) have begun to probe the hitherto 
unconstrained properties of galaxies at z ~ 7. The efficient 
'wedding-cake' strategy of combining wide and deep imaging 
from the Cosmic Assembly Near-infrared Deep Extragalac- 
tic Legacy Survey (CANDELS) programme (Grogin et al. 
2011; Koekemoer et al. 2011) with ultra-deep imaging in 
the Hubble Ultra Deep Field (HUDF; e.g. Bouwens et al. 
2010b; Bunker et al. 2010; Finkelstein et al. 2010; Lorenzoni 
et al. 2011; McLure et al. 2010, 2011; Oesch et al. 2010b, 
2012a,b; Wilkins et al. 2011; Yan et al. 2011) is now beginning
to provide large, statistical samples of z ≃ 7 galaxies
with good dynamic range in luminosity. The faintest of these
galaxies are expected to be a key source of the ionising photons
required to reionise the universe, yet constraining their
contribution relies on measuring their UV colours in order 
to estimate the galaxies' production rate of Hydrogen ionis- 
ing photons and the fraction of such photons which escape 
the galaxy to reionise the IGM (e.g. Robertson et al. 2010; 
Finkelstein et al. 2012a). 

The UV spectrum can be parametrized by the UV continuum
slope β, defined by f_λ ∝ λ^β (e.g. Meurer et al. 1999),
such that a flat spectrum with zero colour in the AB magnitude
system has β = −2. A UV slope of β = −2 is typical
of a young, un-reddened, low-metallicity, star-forming
galaxy at z ~ 2 (e.g. Erb et al. 2010). The current interest
in the UV slopes of z ~ 7 galaxies began when the optical
ACS data in the HUDF (Beckwith et al. 2006) were
complemented by the WFC3/IR Y, J, H-band data of the
HUDF09 programme (GO 11563, PI: Illingworth; Bouwens
et al. 2010a). The first substantial catalogues of z ≃ 7 Lyman
Break Galaxies (LBGs) became available following the
first epoch of this programme in 2009. Hereafter, we refer to
data taken prior to and during this first HUDF09 epoch as
HUDF09E1. Later studies have made use of further data ob- 
tained in a second epoch, and we refer to the stack of all the 
WFC3/IR data from epochs 1 and 2 as HUDF09FULL. In 
this paper we also refer to the WFC3 Early Release Science 
(ERS, Windhorst et al. 2011) programme, which provides 
shallower imaging over a wider (36.5 sq. arcmin) field than 
the HUDF (4.5 sq. arcmin). 

In an initial foray into the measurement of β at
z ~ 7, Bouwens et al. (2010b) found evidence for a colour-magnitude
relation such that the faintest z ≃ 7 galaxies exhibited
sufficiently blue average UV colours (⟨β⟩ = −3.0 ± 0.2
at −19 ≤ M_UV,AB ≤ −18) that extremely young ages
and ultra-low metallicities could be inferred. If confirmed,
this would make it necessary not only to account for the rapid
evolution of stellar populations from z ≃ 7 → 6, but also
to conclude that the UV photon escape fraction must be
sufficiently high at z ~ 7 that nebular continuum emission
does not significantly redden the observed SED. With
the same dataset - the HUDF09E1 - Finkelstein et al.
(2010) found similarly extreme β values, although with a
sufficiently large error that it was not necessary to invoke
'exotic' stellar populations. In fact, they suggested that
only moderately young, dust-free stellar populations are required
to reproduce the observed colours. With improved
data in the final HUDF09FULL, a revised assessment reported
by Bouwens et al. (2012) retains a clear colour-magnitude
trend, albeit with the faintest objects averaging
only ⟨β⟩ = −2.68 ± 0.19 ± 0.28 (biweight mean ± random
± systematic uncertainties). Significantly, Finkelstein et al.
(2012b) also find that the full HUDF09FULL dataset provides
somewhat redder colours for the faintest objects, with
⟨β⟩ = −3.07 ± 0.51 in HUDF09E1 (Finkelstein et al. 2010),
and ⟨β⟩ = −2.68to.ll (≈ −2.4 after bias correction) in
HUDF09FULL (Finkelstein et al. 2012b). Already it should 
be clear from these revised estimates and their quoted un- 
certainties that the typical UV spectral slope of the faintest 
galaxies at z ≈ 7 is not well constrained.

In response to the initial claims of the discovery of
exotic stellar populations at z ≈ 7, Dunlop et al. (2012)
demonstrated that measurement biases can result in a population
of normal star-forming galaxies with ⟨β⟩ ≈ −2 being
observed as a population of extremely blue objects,
especially when average properties are calculated for objects
close to the detection limit of the imaging. This is because, 
as the scatter in observed colour inevitably rises when the 
flux-density limit of the survey is approached, the methods 
used to select LBGs (either simple colour-colour selection, 
or multi-band photometric redshift determination) can start 
to preferentially exclude genuine high-redshift objects whose 
colours have been scattered to very red values (treating them 
as likely lower-redshift interlopers). It is important to stress 
that this is not the same as saying that LBG selection is lim- 
ited to extremely blue objects. In fact the commonly-used 
colour-colour selection criteria, and photometric-redshift se- 
lection techniques can admit quite red LBGs, especially if 
the photometric-redshift technique is not confined to the 
most secure candidates. Nevertheless, because photometric 
scatter can result in extreme (indeed often unphysical) values
of β for individual objects, the reddest objects can be
"lost" to ostensibly low photometric redshifts, while the extreme
blue objects never are (with resulting implications for
the calculation of average values of β).

Contrary to the claim made by Bouwens et al. (2012),
Dunlop et al. (2012) focussed on the inclusion of all candidate
objects with even a marginally-preferred high-redshift
LBG solution, and showed that, with existing data, there
still exists a significant blue bias (Δ⟨β⟩ ≈ −0.5) in the inferred
value of ⟨β⟩ for the faintest LBGs at z ≈ 7 (the bias
is simply more extreme if only the most robust LBG candidates
are considered; e.g. McLure et al. 2011). As we show
later in this paper, this level of bias also applies to colour-colour
selected samples (e.g. Bouwens et al. 2012) and is
exacerbated by the imposition of a J-band flux threshold in
the galaxy sample selection (e.g. Bouwens et al. 2010b).

From the above discussion, it is clear that the steepness
of the UV slope for sub-L*, z ~ 7 galaxies remains an
open question. The situation is further confused by the fact
that different studies have used different datasets, selection
methods and techniques for measuring β (e.g. from a single
near-IR colour or from SED fitting). The first objective of
this paper is therefore to use simulated data to investigate
the impact of image depth, selection biases and measurement
techniques on the recovered values of β. Then, based on
our findings, we explore possible strategies for the optimum
analysis of the new, deeper WFC3/IR imaging of the HUDF
which will be provided by the HUDF12 project (including
imaging in an additional waveband, J140; GO 12498, PI:
Ellis; public data release by early 2013), in order to extract
the most robust, least-biased estimate of β for the faintest
LBGs at z ≈ 7.

Throughout this paper we will consider z ≈ 7 LBGs
to be objects selected with photometric redshift solutions
in the range 6.5 ≤ z ≤ 7.5. As with previous studies in this
area, we do not consider z ~ 8 galaxies given the lack of data
redward of the Lyman break (which begins to attenuate the
J125-band flux at z > 7.9).

This paper is laid out as follows. In Section 2, we summarise
three methods of measuring β: from a single colour,
a power-law fit, or the best-fitting galaxy-model SED. In Section
3, we provide a description of our simulation pipeline.
In Section 4 we compare our results with, and endorse, the conclusions of
Dunlop et al. (2012). In Section 5, we show the results of simulated
HUDF-09E1, -09E2, -12 and ERS datasets, comparing
various selection methods and the three β measurement
methods. We present a re-analysis of the Dunlop et al. (2012)
z ≈ 7 LBG sample in Section 6. Strategies for analysing the
HUDF12 data are presented in Section 7, wherein we briefly
discuss the effect of Lyman Alpha Emitters in Section 7.3. In
Section 8, we present our conclusions and outline our plans
for future studies.

Where relevant, we assume a cosmology with Ω_0 =
0.3, Ω_Λ = 0.7, H_0 = 70 km s^-1 Mpc^-1 and quote magnitudes
in the AB system (Oke 1965). For convenience, we
use B435, V606, i775, z850, Y098, Y105, J125, J140 and H160 to
refer to the HST ACS F435W, F606W, F775W, F850LP and
WFC3/IR F098M, F105W, F125W, F140W and F160W filters
respectively.

Figure 1. The typical Spectral Energy Distribution (SED) of a z = 7 Lyman Break Galaxy, and the filters used to observe it. The
SED shown by a thick black line is a β = −2 Bruzual & Charlot (2003) stellar population model (in this case a 60 Myr old 0.2 Z⊙ starburst),
attenuated by Madau (1995) prescribed IGM absorption. The red and orange filter profiles shown are HST ACS z850, WFC3/IR
Y105, J125 and H160 - those filters that probe the rest-frame UV at z ~ 6 to 7 in the HUDF09 dataset. Shown in blue above the SED
are the locations of the ten Calzetti et al. (1994) UV windows, used in the 'best-fitting model' measurement method (see Section 2.3).
The unshaded Calzetti et al. (1994) window is neglected in our fitting (see text). Two grey SED curves show the Lyman break position
at z = 6.5 and 7.5, illustrating that, within the z ≈ 7 sample, the Lyman break always attenuates the light within the Y105-band. The
Y098 (F098M) filter used in the ERS observations approximately spans the shorter two thirds of Y105's wavelength coverage, without
overlapping J125.



2 METHODS OF DETERMINING β

In this section, we describe three methods for measuring the
UV slope of high-redshift galaxies and show that, for perfect
photometry, they yield similar results. Later, in Section
5.3, we explore their relative strengths for estimating β in
realistic simulated data.



2.1 Single colour (β_J−H)

The UV spectral index β may be approximated from a single
colour with no prior assumptions about the underlying spectrum.
Where the colour comprises two filters comfortably
redward of the Lyman break it is insensitive to small errors
in the photometric redshift. Moreover, IGM absorption and
any Lyman-α emission present do not contaminate the continuum
slope measurement through filters fully redward of
1216 Å in the rest-frame. With WFC3 photometry of objects
at z > 6.5,

$$\beta_{J-H} = 4.43\,(J_{125} - H_{160}) - 2 \qquad (1)$$

is typically used to estimate β, where the coefficient 4.43 is
found from the filter pivot wavelengths (e.g. Tokunaga &
Vacca 2005) using 1/[2.5 log10(λ_H/λ_J)], with λ_J = 12486 Å and
λ_H = 15369 Å when the full instrument throughput, including
the detector response function, is taken into account (Dressel 2011).
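
As a quick check of equation (1), the short Python sketch below recomputes the coefficient from the two pivot wavelengths quoted above and converts a J125 − H160 colour into a slope. The function names are illustrative choices made here, not part of any published pipeline.

```python
import math

# Pivot wavelengths in Angstrom, as quoted in the text for J125 and H160.
LAMBDA_J = 12486.0
LAMBDA_H = 15369.0


def colour_coefficient(lam_blue, lam_red):
    """Return 1 / (2.5 log10(lam_red / lam_blue)), the factor in equation (1)."""
    return 1.0 / (2.5 * math.log10(lam_red / lam_blue))


def beta_from_colour(j_mag, h_mag):
    """Estimate beta from AB magnitudes in J125 and H160 using equation (1)."""
    return colour_coefficient(LAMBDA_J, LAMBDA_H) * (j_mag - h_mag) - 2.0


coeff = colour_coefficient(LAMBDA_J, LAMBDA_H)
print(round(coeff, 2))               # coefficient, rounded to two decimals
print(beta_from_colour(28.0, 28.0))  # zero colour corresponds to beta = -2
```

A source that is 0.2 mag brighter in J125 than in H160 would, under this relation, have β close to −2.9, which illustrates how strongly a modest colour error propagates into β.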



2.2 Power-law (β_YJH)

Objects whose rest-frame UV continuum is present in several
filters redward of the Lyman break should in principle
have their UV slope better constrained by using all the
available information. In the HUDF09 and ERS at z ~ 6.5,
Y_{098|105}, J125 and H160 lie redward of the Lyman break
and the additional use of the Y − J colour here should improve
the constraint on β over the use of only a single J − H
colour. By z ≈ 7.5, the Lyman break diminishes the Y105-band
flux by almost a half, and a power-law fit should begin
to approach a single colour. However, it is not immediately
clear whether employing the Y-band for galaxies in the redshift
range (6.5 ≤ z ≤ 7.5) in which the Lyman break is
travelling through Y will be beneficial, given the potential
for a colour-dependent misplacing of the break within the filter.
Moreover, high equivalent-width Lyman-α emission lines
could bias β measurements to significantly bluer values than
the intrinsic continuum slope, an effect we investigate in Section
7.3. In the power-law β measurements presented below,
the photometric redshift of an object is used to build a grid
of SEDs with varying power-law β values redward of the
Lyman break, and zero flux at λ < 1216 Å. Synthetic photometry
of each power-law SED is created, and an object's
YJH photometry is used to select the best-fitting β from
the grid via a χ² fit.
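
The grid-based power-law fit can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this work: the three passbands are hypothetical top-hat filters with rough wavelength limits (real throughput curves would be needed in practice), the normalisation is solved analytically at each grid point, and all function names are invented for the example.

```python
LYMAN_BREAK = 1216.0  # rest-frame Angstrom; flux is set to zero blueward of this

# Hypothetical top-hat passbands in Angstrom (illustrative only, not real throughputs).
FILTERS = {
    "Y105": (9000.0, 12000.0),
    "J125": (11000.0, 14000.0),
    "H160": (14000.0, 17000.0),
}


def band_flux(beta, z, band, n_steps=400):
    """Band-averaged f_nu (arbitrary units) of a power-law SED, f_lambda ~ lambda^beta."""
    lo, hi = FILTERS[band]
    step = (hi - lo) / n_steps
    break_obs = LYMAN_BREAK * (1.0 + z)
    numerator = denominator = 0.0
    for i in range(n_steps):
        lam = lo + (i + 0.5) * step
        f_lam = (lam / 15000.0) ** beta if lam >= break_obs else 0.0
        numerator += f_lam * lam * step  # integral of f_lambda * lambda
        denominator += step / lam        # integral of d(lambda) / lambda
    return numerator / denominator


def fit_beta(fluxes, errors, z, grid):
    """Return (best beta, chi2) from a grid search with a free normalisation."""
    best_beta, best_chi2 = None, float("inf")
    for beta in grid:
        model = {band: band_flux(beta, z, band) for band in fluxes}
        weight = sum(model[b] ** 2 / errors[b] ** 2 for b in fluxes)
        scale = sum(fluxes[b] * model[b] / errors[b] ** 2 for b in fluxes) / weight
        chi2 = sum(((fluxes[b] - scale * model[b]) / errors[b]) ** 2 for b in fluxes)
        if chi2 < best_chi2:
            best_beta, best_chi2 = beta, chi2
    return best_beta, best_chi2


grid = [-4.0 + 0.05 * k for k in range(81)]  # beta from -4 to 0 in steps of 0.05
truth = {band: band_flux(-2.3, 7.0, band) for band in FILTERS}
errors = {band: 0.05 * truth[band] for band in FILTERS}  # assumed 5 per cent errors
best_beta, best_chi2 = fit_beta(truth, errors, 7.0, grid)
print(best_beta, best_chi2)
```

With noiseless input photometry the grid point nearest the input slope is recovered. Adding noise to `truth`, or supplying a slightly wrong redshift, gives a feel for how the partially attenuated Y-band flux influences the fit.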

2.3 Best-fitting stellar population synthesis model (β_BC03)

Figure 2. The UV slope β, measured using the three methods
discussed in Section 2, for five input SEDs. The inputs are various
ages of a BC03 population synthesis model at z = 7 (adopting a
Chabrier 2003 IMF, 0.2 Z⊙ metallicity, and a single burst model).
Ages of 2.4, 22, 61, 130 and 250 Myr give β ≈ −3, −2.5, −2, −1.5 and −1
respectively. Although all three methods agree perfectly for true power-law
SED inputs (not shown), it can be seen that small discrepancies
(Δβ < 0.2) exist for more realistic input SEDs.

To allow a measure of the rest-frame UV continuum slope
unaffected by absorption and emission features, Calzetti,
Kinney, & Storchi-Bergmann (1994, hereafter C94) defined
ten spectral windows in the rest-frame UV avoiding significant
spectral features. While the windows were defined for use on continuum
spectra, Finkelstein et al. (2012b) advocate their use on
photometric data via SED fitting. The use of
population synthesis models allows this 'pseudo-spectroscopic'
measurement to make full use of the available
photometry. Our implementation of this method uses FAST
(Kriek et al. 2009) to perform SED fitting of multi-band
photometry, returning both the photometric redshift and
the best-fitting Bruzual & Charlot (2003, hereafter BC03)
population synthesis model SED. The C94 windows then select
the regions of the SED for the power-law fit (a linear fit
of log f_λ vs. log λ). The blue limit of the resultant β parameter
space, β_min = −3.2, is governed by the lowest metallicity
(0.05 Z⊙) and youngest (1 Myr) simple stellar population
included in the grid. This differs slightly from the approach
of Finkelstein et al. (2012b), who use EAZY (Brammer, van
Dokkum, & Coppi 2008) to obtain the photometric redshift
before further SED fitting with BC03 models (or updated
variants). However, locking the redshifts with EAZY prior to
fitting the stellar populations with FAST shows no appreciable
improvement in the recovery of β or photometric redshift
with respect to the input values.
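
The window-based measurement amounts to a straight-line fit in log-log space, restricted to the C94 wavelength intervals. The sketch below illustrates this for a toy power-law spectrum standing in for a best-fitting model SED. The window limits are the rest-frame values (in Å) as commonly tabulated from C94 and should be checked against that paper before any serious use; the function name and the toy spectrum are assumptions made for this example.

```python
import math

# Rest-frame UV windows in Angstrom, as commonly tabulated from C94 (verify before use).
C94_WINDOWS = [
    (1268, 1284), (1309, 1316), (1342, 1371), (1407, 1515), (1562, 1583),
    (1677, 1740), (1760, 1833), (1866, 1890), (1930, 1950), (2400, 2580),
]


def beta_from_windows(wave, flux, windows):
    """Least-squares slope of log10(f_lambda) against log10(lambda) inside the windows."""
    xs, ys = [], []
    for lam, f in zip(wave, flux):
        if f > 0 and any(lo <= lam <= hi for lo, hi in windows):
            xs.append(math.log10(lam))
            ys.append(math.log10(f))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Toy rest-frame spectrum: a pure power law with beta = -2 sampled every 2 Angstrom.
wave = [1200.0 + 2.0 * i for i in range(751)]
flux = [(lam / 1500.0) ** -2.0 for lam in wave]

beta_ten = beta_from_windows(wave, flux, C94_WINDOWS)        # all ten windows
beta_nine = beta_from_windows(wave, flux, C94_WINDOWS[:-1])  # reddest window dropped
print(beta_ten, beta_nine)
```

For a pure power law the two results coincide; for a model SED with spectral structure near the reddest window they need not, which is why the choice of windows matters when comparing against colour-based estimates.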



2.4 Cross-checking the methods 

As shown in Fig. 2, the three methods generally agree to
within Δβ < 0.2 when provided with perfect photometry
of a BC03 SED. With such data, we find that the β_YJH,
β_J−H and β_BC03 methods agree better when the reddest of
the ten C94 windows is neglected. As shown in Fig. 1, the
reddest C94 window is redward of the H160-band at z ~ 7
and therefore purports to probe a region of the intrinsic SED
not covered by the photometry. Thus, any spectral features
present in the BC03 models in that region (e.g. the slight
jump in flux at 1.8 μm observer-frame in Fig. 1) will cause
discrepancies between β as measured from colours alone and
from the best-fitting model. We believe this issue may partially
explain the offset between β_J−H and β_BC03 seen in fig.
3 of Finkelstein et al. (2012b). For this work, we therefore
adopt the nine shortest-wavelength C94 windows.



3 SIMULATION METHODOLOGY 

In this section, we present our method for creating mock cat- 
alogues of high-redshift galaxies and producing multi-band 
images of them with realistic noise properties. The subse- 
quent object recovery, redshift and colour fitting of these 
simulated galaxies is then described. The resulting simulations
are used thereafter to study the measurement of β at
z ≈ 7.



3.1 Stellar population choice 

In the simulations used for the remainder of this work, we
adopt two model SEDs for simplicity of comparison. BC03
models of 0.2 Z⊙ metallicity with a Chabrier (2003) IMF
and ages of 2.4 and 61 Myr are chosen to give SEDs with
β_in ≈ −3 and β_in ≈ −2 respectively. These models deviate
slightly from perfect power-law SEDs, allowing the three
β measurement methods to yield different results (see Fig.
2). However, by using BC03 models in preference to pure
power-laws we are better able to represent the
true SEDs of high-redshift galaxies. In our simulations, the
input SEDs do not include reddening due to interstellar
dust. Whilst not relevant to the β measurement itself
- being only an adjustment of the intrinsic β distribution
- it is illustrative to see how low the reddening must be to
allow galaxies to be observed with β ~ −3. Fig. 3 demonstrates
this; for example, β = −3 requires A_V ≲ 0.1 and an
age under 30 Myr with a low-metallicity BC03 model. However,
dust quantities A_V < 1 (perhaps reasonable at high
redshift) cannot redden an SED to β = −1 before the population
is tens or hundreds of Myr old. Populations adopting
a constant star formation history would additionally require
high metallicity to reach β = −1, as β is then almost independent
of age. Wilkins et al. (2012) show simulations detailing
how the star formation history and metal enrichment history
additionally affect the β distributions.



3.2 Simulated image creation 

Our simulation design departs somewhat from that used in
recent studies. Rather than inserting sources into the real
data images, we perform fully synthetic simulations. This
choice allows a consistent treatment of existing datasets
and the anticipated HUDF12 dataset. In Section 3.4 we
verify that this choice does not affect the measured scatter
in β. Our simulation begins by computing theoretical
magnitudes for galaxies of both stellar population models,
at a range of redshifts and absolute magnitudes, through
the observed filters. To create sky images, empirical HST
Point Spread Functions (PSFs) are initially inserted into
blank images. The inserted PSFs are randomly spatially distributed,
but pixel-centred, with the relative number density
at each redshift slice given by the evolving luminosity function
of McLure et al. (2009, eqn. 3), which reasonably reproduces
the observed z = 7 (McLure et al. 2010) and z = 8
(Bradley et al. 2012) functions. In the simulations the absolute
number density is arbitrarily boosted to allow more
robust statistics to be derived. We mitigate source confusion
(n_sources ≲ 0.1 per aperture) by an arbitrary choice of
image size and by neglecting to insert objects at redshifts
significantly distant from those of interest. For instance, the
input catalogue does not feature a z ~ 2 interloper population
(although objects may freely be designated as such
'low-z escapees' by the photometric redshift code). Redshift
z = 5-9 galaxies are included, however, to allow migration
of galaxies into and out of the redshift bin of interest, an
effect which can have a significant impact on the measured
UV slopes. The luminosity function is integrated down to
M_UV(1500 Å) = −16, fainter by ≈ 2 mag than the least luminous
HUDF objects in the McLure et al. (2011) robust
sample of high-redshift galaxies. Those objects below the
detection threshold are useful in providing the simulated
image background with some of the non-uniformity seen in
real data.

Figure 3. Contours showing intrinsic stellar population β values
in the Age-A_V parameter space. A_V parametrizes the dust attenuation,
calculated according to the Calzetti et al. (2000) prescription.
The population model is the same BC03 1/5 Z⊙ burst used
throughout our simulations (for solar metallicity, add +0.25 to
each contour's β value).

We add artificial noise to these "perfect" images, designed
to match the noise properties of a given survey. Table
1 lists the measured 5σ limiting detection magnitudes
for the ERS, HUDF09E1 and HUDF09FULL surveys and
estimated limits for the HUDF12. These depths are computed
from the standard deviation of 0.6-arcsec aperture
fluxes placed on source-free regions of the images (McLure
et al. 2011). To ensure consistency between these data and
our simulations, the simulated noise is designed to match the
data's depth when measured in this way, while also accounting
for the pixel-to-pixel correlation in the data. Specifically,
we create a noise image for which every pixel is assigned a
gaussian random value

$$n_{\rm px} = \mathrm{random}\left(\mu = 0,\ \sigma = \sqrt{\frac{(\sigma_{\rm aper})^2}{\ldots}}\right), \qquad (2)$$

where σ_aper is the 1σ limiting aperture flux corresponding
to AB_5σ, the limiting detection magnitude of the survey,
pre-corrected for pixel-to-pixel correlation by

$$\sigma_{\rm aper} = \ldots\,\log_{10}\left(\mathrm{zeropoint} - \left(AB_{5\sigma} + \ldots\right)\right). \qquad (3)$$

The noise image is then smoothed with a gaussian filter of
standard deviation s_px along each axis, which correlates
the pixel noise such that the standard deviation of aperture
flux in source-free regions yields the desired limiting magnitude,
yet with global RMS noise (which defines the detection
threshold) closely comparable to the real data.
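
As a rough illustration of the first step of this noise recipe, the sketch below converts a 5σ limiting AB magnitude into a 1σ aperture flux using the standard magnitude-flux relation, and then into a per-pixel standard deviation for purely uncorrelated noise. It deliberately omits the pre-correction for pixel-to-pixel correlation and the subsequent smoothing described above. The zeropoint value is a placeholder, and the 10-pixel aperture diameter is taken from the photometry description in Section 3.3.

```python
import math


def one_sigma_aperture_flux(ab_5sigma, zeropoint):
    """1-sigma aperture flux (image units) implied by a 5-sigma limiting AB magnitude."""
    flux_5sigma = 10.0 ** (0.4 * (zeropoint - ab_5sigma))
    return flux_5sigma / 5.0


def uncorrelated_pixel_sigma(sigma_aper, aperture_diameter_px):
    """Per-pixel sigma that yields sigma_aper in a circular aperture if pixels are independent."""
    n_pix = math.pi * (aperture_diameter_px / 2.0) ** 2
    return sigma_aper / math.sqrt(n_pix)


ZEROPOINT = 26.0  # hypothetical AB zeropoint, for illustration only
DEPTH_J125 = 28.7  # HUDF09E1 J125 5-sigma depth from Table 1

sigma_aper = one_sigma_aperture_flux(DEPTH_J125, ZEROPOINT)
sigma_px = uncorrelated_pixel_sigma(sigma_aper, 10.0)
print(sigma_aper, sigma_px)
```

Because smoothing a noise image reduces its pixel-level scatter while correlating neighbouring pixels, the per-pixel value needed before smoothing differs from this uncorrelated estimate, which is the role of the pre-correction in the text.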

3.3 Object recovery 

For all our simulations, objects are recovered using
SExtractor 2.8.6 (Bertin & Arnouts 1996) in dual-image
mode. Objects are selected, unless otherwise
noted, from the J125 image down to the 1.4σ level
(DETECT_THRESH=1.4, THRESH_TYPE=RELATIVE) for two adjacent
pixels (DETECT_MINAREA=2). This method typically selects
≈ 10−20 × N_input objects. Photometry is performed
in 10-pixel (0.6-arcsec) diameter apertures for all bands.
The resulting catalogues are cut such that MAG_APER (detection
band) is brighter than the 5σ limit, retaining approximately
70 per cent of the input objects that were intrinsically
brighter than the 5σ limit. The fluxes of objects
in these catalogues are corrected to total assuming point
source aperture corrections for each band. A z ≈ 7 sample
is then obtained by applying a selection function to the
catalogue, either a colour-colour cut or a full photometric
redshift selection. In the latter case, we find the redshift
and best-fitting stellar population using FAST (Kriek et al.
2009) with a wide library of BC03 models. For clarity, both
input and fitted SEDs contain only simple stellar populations.
Age and metallicity are fitted, however, and the models
available to FAST include those that were used for input.
As Finkelstein et al. (2012b) discuss, the actual choice of
models should have little influence on the β values, provided
a wide range of models is available. Thus, fully investigating
the degeneracies between population parameters is not
necessary in order to measure β (although see Section 3.4).
Following Dunlop et al. (2012), we split the sample into ROBUST
and UNCLEAR categories. The ROBUST sample contains
only galaxies whose primary photometric redshift solution
at 6.5 < z < 7.5 is preferred to any secondary solutions by
Δχ² ≥ 4. Galaxies failing to meet this criterion but nonetheless
having a preferred z ~ 7 solution are denoted UNCLEAR.
Hereafter, we refer to the combination of ROBUST+UNCLEAR
as ALL.

Recovered object candidates are paired to input objects 



6 A. B. Rogers, R. J. McLure and J. S. Dunlop 



Table 1. Limiting magnitudes (5σ depths) of the datasets considered in this work are shown for HST ACS and WFC3 imaging. The
ERS and HUDF09E1 depths are measured following the method of McLure et al. (2011) and using their image reductions; the HUDF09FULL
depths come from a consistent treatment of the data released by Bouwens et al. (2012); approximate depths for the forthcoming HUDF12
WFC3 programme are marked by *. The depths are taken in 0.6-arcsec diameter circular apertures in blank regions of the images
as described in McLure et al. (2011). HUDF09E1 and HUDF09FULL refer to the first and both epochs of the HUDF09 programme
respectively. †While the HUDF12 programme features no further observations in F125W, modest depth improvements are expected from
improved reductions of existing data.

Dataset      B435   V606   i775   z850   Y098   Y105    J125     J140    H160
ERS          27.7   27.9   27.3   27.1   27.2   -       27.6     -       27.3
HUDF09E1     29.0   29.5   29.2   28.5   -      28.6    28.7     -       28.7
HUDF09FULL   29.0   29.5   29.2   28.5   -      28.7    28.9     -       28.8
HUDF12       29.0   29.5   29.2   28.5   -      29.5*   29.0*†   29.0*   29.0*



based on their recovered positions. Strict position matching
(< 2-pixel radial offset) leaves a sample free of pure
noise spikes (< 1 per cent at 5σ), yet retains objects where
noise spikes, having randomly boosted the flux in individual
bands, significantly alter the colours. Detected objects
for which identification of the corresponding input object is
ambiguous are similarly dropped (~ 1 per cent), although
this is minimized by avoiding significant crowding in the simulations.
Objects are only deemed "ambiguous" when two or
more input objects lie within 2 pixels of a recovered object's
position. With aperture diameters of 10 pixels, there is still
ample ability for faint background objects to contribute to
the recovered flux of the detected object. In summary, our
selection criteria are:

S/N(J) > 5.

6.5 < z_phot(FAST) < 7.5.

Sky position is within 2 pixels of a single input object.

ROBUST: χ²(secondary z) − χ²(primary z) ≥ 4.

UNCLEAR: 0 < χ²(secondary z) − χ²(primary z) < 4.

Overall these criteria allow the relevant measurement biases
to become manifest while allowing precise tracing of input to
recovered parameters. Having created catalogues with both
input and output redshift and photometry parameters, the
three β measurements for this final sample are made based
on the three methods presented above.
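
The selection criteria listed above can be expressed compactly in code. The sketch below is an illustration only: the function and argument names are invented here, and the inputs (detection-band signal-to-noise ratio, photometric redshift, χ² of the primary and secondary redshift solutions, and the number of input objects within 2 pixels) are assumed to have been measured already.

```python
def classify_candidate(snr_j, z_phot, chi2_primary, chi2_secondary, n_matches):
    """Return 'ROBUST', 'UNCLEAR', or None for an object failing the selection."""
    if snr_j <= 5.0:
        return None                      # fails the S/N(J) > 5 cut
    if not (6.5 < z_phot < 7.5):
        return None                      # primary solution outside the redshift bin
    if n_matches != 1:
        return None                      # unmatched or ambiguous position match
    delta_chi2 = chi2_secondary - chi2_primary
    if delta_chi2 >= 4.0:
        return "ROBUST"
    if delta_chi2 > 0.0:
        return "UNCLEAR"
    return None


def in_all_sample(label):
    """The ALL sample is the union of ROBUST and UNCLEAR objects."""
    return label in ("ROBUST", "UNCLEAR")


print(classify_candidate(6.2, 6.9, 3.1, 9.4, 1))   # ROBUST
print(classify_candidate(5.4, 7.2, 3.1, 4.0, 1))   # UNCLEAR
print(classify_candidate(4.8, 7.0, 3.1, 9.4, 1))   # None
```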

3.4 Synthesised noise vs. real noise 

As discussed above, we have opted to simulate the noise
properties of deep HST images rather than inject sources directly
into the real images. This approach is chosen to allow
a consistent treatment of the forthcoming HUDF12 imaging.
We verify that our simulated noise maps yield equivalent
results to a source injection scheme by inserting PSFs
both into the real HUDF09E1 imaging and into synthetic
images with noise designed to match the measured depths
of the real data. Objects are detected and extracted in an
identical manner in each case. For the source injection simulation
only, the catalogues are pruned of any objects already
present in the unmodified HUDF09E1 images before continuing
with the photometric redshift analysis. Fig. 4 shows
the resulting β - J125 scatter for each approach. Very similar
widths in the scatter of β are seen at each magnitude,
with no significant offset in colour or magnitude between
the samples: at J125 = 28.0 ± 0.25 both σ(β) and ⟨β⟩ differ
by ~ 0.1 (simulated noise: ⟨β⟩ = −2.3, σ(β) = 1.0; source
injection: ⟨β⟩ = −2.2, σ(β) = 1.0). This confirms that the
scatter in β for faint, low-SNR objects is well reproduced by
the simulated noise scheme used throughout this work.

Figure 4. Comparison of the colour-magnitude scatter of simulated
z ≈ 7, β_in = −2 galaxies in the HUDF09E1 data using
source injection into the real images (blue points) or synthesised
noise in blank images (red points). The distribution and scatter
of measured UV slopes β_J−H is very similar in both cases. One
third fewer objects are recovered from a source injection simulation
(where some inserted objects fall onto existing sources) than
in the simulated noise simulations, but for clarity the number of
objects in each data series is identical in this figure. In both cases,
ALL (both ROBUST and UNCLEAR) 6.5 ≤ z < 7.5 sources are shown.
Dashed lines show the input β and the effective 5σ detection limit
of the data.



3.5 Extended sources vs. point sources 

Figure 5. Distribution of β vs. apparent magnitude from a
simulation in which all objects have an intrinsic UV slope of
β = −2. Similar to fig. 7 of Dunlop et al. (2012), all objects
have 6.5 ≤ z_in, z_phot < 7.5 and J125^in < 30. Red and blue symbols
denote the ERS and HUDF09E1 simulations respectively.
Open circles denote objects whose photometric redshift is deemed
UNCLEAR, filled circles are objects with ROBUST photometric redshifts.
In contrast to Dunlop et al. (2012), this plot includes a
colour correction of Δ(β_J−H) ≈ +0.2 to account for the flux of
a point source not enclosed by the photometric aperture in each
band.

In the simulations presented here, we have simulated faint
z ~ 7 galaxies with PSFs. While the faint galaxies our simulations
are designed to replicate are very nearly unresolved,
we have nonetheless performed a conservative test of this
assumption. We have drawn half-light radii from a gaussian
distribution centred on 0.65 kpc with σ = 0.15 kpc, consistent
with the size-luminosity results of Oesch et al. (2010a)
for L > 0.3L* galaxies at z ~ 7, and convolved corresponding
GALFIT (Peng et al. 2010) models with the PSF. The
measurement of β is unaffected if we use these sources in
our simulations rather than PSFs. This is as expected given
that our chosen aperture diameter of 0.6 arcsec corresponds
to ~ 5 R_e on average.
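
The statement that a 0.6-arcsec aperture spans roughly five half-light radii can be checked with a short calculation of the angular scale at z = 7 in the adopted cosmology (Ω_0 = 0.3, Ω_Λ = 0.7, H_0 = 70 km s^-1 Mpc^-1). The sketch below uses a simple midpoint-rule integration of the flat-universe comoving distance; the function names are illustrative.

```python
import math

C_KM_S = 299792.458                    # speed of light in km/s
H0, OMEGA_M, OMEGA_L = 70.0, 0.3, 0.7  # cosmology adopted in this paper


def angular_diameter_distance_mpc(z, steps=2000):
    """Angular diameter distance in Mpc for a flat universe (midpoint-rule integral)."""
    dz = z / steps
    integral = 0.0
    for i in range(steps):
        zi = (i + 0.5) * dz
        integral += dz / math.sqrt(OMEGA_M * (1.0 + zi) ** 3 + OMEGA_L)
    comoving = (C_KM_S / H0) * integral
    return comoving / (1.0 + z)


def kpc_per_arcsec(z):
    """Proper transverse scale in kpc subtended by one arcsecond at redshift z."""
    arcsec_in_rad = math.radians(1.0 / 3600.0)
    return angular_diameter_distance_mpc(z) * 1000.0 * arcsec_in_rad


scale = kpc_per_arcsec(7.0)
r_e_arcsec = 0.65 / scale              # mean half-light radius of 0.65 kpc
aperture_in_r_e = 0.6 / r_e_arcsec     # aperture diameter in units of R_e
print(scale, r_e_arcsec, aperture_in_r_e)
```

The ratio comes out close to five, consistent with the statement in the text.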



4 COMPARISON TO DUNLOP ET AL. (2012) 

In contrast to the main simulation approach adopted in
this work, Dunlop et al. (2012) injected PSFs into the real
HUDF09E1 and ERS images. Crowding was avoided by
inserting sources only within the detection redshift range
(6.5 < z_in < 7.5). Furthermore, only objects with J125 <
30 were included - preventing excess noise being supplied
by extra ultra-faint sources.

Our new simulations, when limited to the same inputs
and selection function, yield results in very good agreement
with those of Dunlop et al. (2012). Fig. 5 shows the recovered
β_J-H values for a simulated population of faint β_in = -2
objects in the HUDF09E1 and ERS fields, and is remarkably
similar to fig. 7 of Dunlop et al. (2012). There is a
clear offset to blue βs, which becomes progressively worse
for fainter objects. In the HUDF simulation, objects in the
faintest 1 mag bin average ⟨β⟩ = -2.4. This is even more
pronounced in the ERS, where the J-band imaging is deeper
than the H-band imaging and where the Y098 filter, which
cuts off at a shorter wavelength than the HUDF's Y105 filter,
is used. The bias in the ERS becomes catastrophic for the
faintest objects (objects in the faintest 1 mag bin average
⟨β⟩ = -2.7). The photometric redshifts of even relatively
high SNR objects are often deemed UNCLEAR when β is red,
meaning a ROBUST sample excluding such red objects will
show a further blueward bias. Corroborating the work of
Dunlop et al. (2012), we find that measuring β from a single
J - H colour for a sample of z ≈ 7 galaxies yields a large
blue bias for the lowest SNR objects.



5 DISCUSSION OF SIMULATION RESULTS 

In this section, we use simulations to investigate how both
the z ~ 7 LBG selection method and the UV slope measurement
method affect the measurement of β. Simulations, as
described above and using β_intrinsic = -3 and -2, have been
constructed of the HUDF -09E1, -09E2, -12 and the ERS
datasets. The depths of these datasets are given in Table 1.
We use these eight simulations throughout the remainder of
this work.

5.1 Comparison of z ~ 7 selection functions 

Many high-redshift galaxy studies have relied on colour- 
colour criteria for sample selection rather than using a full 
SED-fitting photometric redshift code. Here we show an 
illustrative comparison of those colour-colour criteria em- 
ployed by Bouwens et al. (2010b, 2012) and of our photo- 
metric redshift selection using BC03 template SEDs. 

The colour-colour selection criteria described in
Bouwens et al. (2012, hereafter B12) differ from those of
Bouwens et al. (2010b, hereafter B10). In both cases, the
main selection criterion is a 'z850-drop': a z - Y colour of
> 0.8 (in B10) or > 0.7 (in B12). Both studies also prohibit
the selection of red objects, requiring Y - J < 0.8. B12 use
an additional z - Y vs. Y - J colour function, excluding low
redshift interlopers that would otherwise have been newly
selected following the relaxed z - Y criterion. In both studies,
various criteria are used to ensure objects with optical
detections are excluded. Crucially, B10 report the use of a
J125 ≥ 5.5σ cut on their catalogue - a criterion that, as we
shall see, is bound to produce a bias towards the selection of
objects with blue J - H colours. This cut is (apparently)
abandoned by B12, the faint limit of the catalogue being
determined instead by a probability threshold in the detection
image (a χ² image which in this case results in a similar
selection to a Y + J + H stack; Szalay, Connolly, & Szokoly
1999).

We have approximately replicated the selection methods
of B10 and B12, using our HUDF09E1 β_in = -2 simulation.
The same simulation is used in both cases to allow
a comparison of the selection functions independent of the
data variation. In this case a χ² detection image, created
from the Y, J, H images following the procedure of Szalay
et al. (1999), was used for object detection with aperture
photometry performed on individual bands as usual. Redshift
z ~ 7 galaxies were selected according to the criteria

Figure 6. Comparison of various selection functions, shown
by average UV slopes from simulated galaxies in the
HUDF09E1 in magnitude bins of fixed occupancy. Galaxies have
been selected from these images with four selection methods.
Thick and thin red lines show objects selected according to the
dropout criteria of Bouwens et al. (2010b) and Bouwens et al.
(2012) respectively. Hollow and solid blue lines show, respectively,
objects selected by a full photometric redshift analysis when ALL
z ~ 7 candidates are included and when only objects with ROBUST
photometric redshifts are included. Comparing the two thick
lines, we conclude that an inclusive photometric redshift selection
and a traditional colour-colour selection suffer similar bias
in β. As expected, a more exclusive photometric redshift selection
yields a larger blue bias. The vertical, dashed line shows the
effective 5σ flux limit in the J125-band. No J-band SNR cut
is used in the B12 selection method, hence the long tail toward
faint magnitudes. In that case, the dotted red line is a guide
to the J125-band magnitude corresponding to the 5σ limiting
magnitude of the stack, assuming z = 7 and a flat spectrum. These
objects, detected from a combined Y, J, H image, only rise above
the detection threshold due to noise spikes boosting the H-band
flux - hence their red β_J-H colours with respect to the input
(horizontal, dashed line).



of B10 and B12 independently. Two further catalogues were
created by selecting ALL sources with photometric redshifts
6.5 < z < 7.5, and only ROBUST 6.5 < z ≤ 7.5 sources.
As the colour-colour selections give no precise photometric
redshifts, β is measured in all cases from the J - H colours.

In Fig. 6, ⟨β_J-H⟩ is shown for each catalogue as a function
of J125 magnitude. We find that the standard photometric
redshift selection and the colour-colour selection of B10
are similarly biased toward blue β values for faint galaxies.
A photometric redshift selection is only excessively biased
if some additional criteria are used to robustly exclude low
redshift interlopers (i.e. Δχ² ≥ 4).

Bouwens et al. (2012) show, in their fig. 5, that the
B12 selection criteria yield an almost negligible bias in the
average UV slope ⟨β⟩ even for very faint simulated galaxies
at z ~ 7. In contrast, Fig. 6 of this work shows substantial
bias in the B12 selected catalogue. This discrepancy is due
to the choice of which observed data are used as a proxy
for UV luminosity. Here we have used J125, as this probes
rest-frame M1500 most closely throughout the z ~ 7 bin. As
also noted by Finkelstein et al. (2012b), the clarity of the
dependence of β on M1500 is reduced if one chooses to use
m_IR ~ (Y, J, H) as a proxy for M1500.

The difference between the faint ends of the B10 and
B12 colour-magnitude relations in Fig. 6 is striking. The
removal of an explicit 5.5σ cut in J125 by B12 allows many
sources with low J-band SNR to be included, as seen in Fig.
7. In order to be detected in an IR stack, these objects must
be flux-boosted in the H-band, consequently giving them red
J - H colours (the Y-band flux is moderately attenuated at
z = 7). This is clear from Fig. 7, which shows a comparison
of the B10 and B12 selection functions based on the same
simulation as Fig. 6. The addition of faint, red-scattered
sources in the selection function of B12 perhaps explains
why they report a somewhat redder ⟨β⟩ ~ -2.7 than
B10 (⟨β⟩ ~ -3.0) for the faintest galaxies.

In summary, we find that an inclusive photometric redshift
selected sample and a z ~ 7 colour-colour selected sample
are similarly biased at the faint end by preferential
selection of blue-scattered objects.
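The opposite colour biases produced by an explicit J-band SNR cut and by a stack-only detection can be illustrated with a toy Monte Carlo. This is a minimal sketch rather than the simulation pipeline used in this work: it assumes Gaussian noise of equal amplitude in J and H, a two-band (J + H) stack in place of the χ² image, approximate pivot wavelengths of 1.25 and 1.54 μm, and hypothetical function names. All sources have flat f_ν spectra, i.e. an intrinsic β = -2.

```python
import math
import random

# Approximate pivot wavelengths of J125 and H160 in microns (assumed values).
LAMBDA_J, LAMBDA_H = 1.25, 1.54

def beta_from_colour(flux_j, flux_h):
    """UV slope implied by a J - H colour, for f_lambda ~ lambda^beta."""
    colour = -2.5 * math.log10(flux_j / flux_h)  # J - H in AB mag
    return colour / (2.5 * math.log10(LAMBDA_H / LAMBDA_J)) - 2.0

def faint_end_means(true_snr=3.5, n=100000, seed=1):
    """Mean beta for two toy selections of the same beta = -2 population."""
    rng = random.Random(seed)
    j_cut, stack_faint_j = [], []
    for _ in range(n):
        # Fluxes in units of the 1-sigma noise; flat f_nu means equal true flux.
        j = rng.gauss(true_snr, 1.0)
        h = rng.gauss(true_snr, 1.0)
        if j <= 0.0 or h <= 0.0:
            continue  # colour undefined for non-positive flux
        beta = beta_from_colour(j, h)
        if j >= 5.5:
            j_cut.append(beta)  # explicit J-band SNR cut
        elif j < 3.5 and (j + h) / math.sqrt(2.0) >= 5.0:
            stack_faint_j.append(beta)  # faint in J, detected only via the stack
    return sum(j_cut) / len(j_cut), sum(stack_faint_j) / len(stack_faint_j)

mean_j_cut, mean_stack_faint_j = faint_end_means()
print(f"J >= 5.5 sigma cut:          <beta> = {mean_j_cut:.2f}")
print(f"stack-detected, faint in J:  <beta> = {mean_stack_faint_j:.2f}")
```

In this toy setup the sources passing the J-band cut average bluer than β = -2, while sources that are faint in J but still pass the stack threshold are necessarily redder than β = -2, since their H-band flux must exceed their J-band flux.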

5.2 Comparison of β measurement methods

Fig. 8 shows the recovered UV slopes of simulated objects
in the HUDF09E1, HUDF09FULL, HUDF12 and ERS fields
as a function of brightness. In each simulation, β has been
recovered using the three methods described above. Faint
objects show extreme scatter in their UV colours, which is
maximised when using only a single colour measurement.
The scatter becomes extreme at ~ 0.5 mag brighter than
the 5σ limit in each field. In the HUDF12, the addition
of the J140 band primarily benefits the power-law method,
although the other methods do benefit indirectly via improved
redshift recovery. In Fig. 8, the bias toward faint,
blue sources appears similar for the two input βs (except
when using β_BC03). If we had considered only ROBUST
z ~ 7 objects, the bias would be more apparent in the
β_in = -2 simulation. Although the selection function is identical
for both simulations, where β_in = -3 there is a larger
colour space available redward of the intrinsic colour. Objects
can be scattered into this colour space while still being
robustly placed at high redshift. For example, if a galaxy
with β_in = -2 is scattered by Δβ = +2 it is liable to be
considered a potential low-redshift contaminant and therefore
deemed UNCLEAR. With the same scatter, a galaxy with
β_in = -3 will be left with β_J-H = -1 and will likely be kept
as a ROBUST high-redshift candidate.

As can be seen in Fig. 1, an SED fit at z = 7 is essentially
a fit only to (z)YJH photometry, with all bluer
bands providing non-detections as they lie blueward of the
Lyman break. It is therefore unsurprising that the β distributions
as measured from the best-fitting model and from
a power-law fit to the YJH photometry are somewhat similar
for β_in = -2. However, objects with observed β < -3
are unable to have their colours reproduced by the limited
parameter space of population synthesis models. Thus an
apparent tightening of the recovered β distribution is seen
using the best-fitting model method, by virtue only of an
a priori assumption of how blue the UV slope may be. In

Figure 7. Comparison of the colour-colour selection functions, and resulting samples from simulated images, of B10 (Bouwens et al.
2010b) and B12 (Bouwens et al. 2012). Small black dots mark galaxies selected from a HUDF09E1-like simulation, where all galaxies
have β_in = -2, using the selection function of B10. The vertical axis shows the Lyman break size (z ~ 7, z850-dropouts). The horizontal
axis shows the Lyman-α-to-UV colour. Coloured circles show a catalogue, selected from the same images, using the B12 selection criteria.
Bluer symbols denote steeper UV slopes β, measured by the J - H colour. Larger symbols denote the galaxies faintest in J125 (i.e. largest
magnitude). The B12 selection allows galaxies faint in J125, with consequently red J - H colours, to be selected - the large, red circles
in the lower left of the plot - which were not selected in B10 due to a J125 ≥ 5.5σ cut. Objects selected by B10's criteria in the lower
right of the plot are treated as contaminants by B12. As can be seen from the colours of nearby objects, many of these would hold blue
J - H colours. This combination of changes will clearly allow B12's selection criteria to yield a redder average β - for the same data -
than that of B10.



fact, were the intrinsic colours of faint z ≈ 7 galaxies as
blue as β = -3, a population average of ⟨β_BC03⟩ would not
yield this result, but rather a red-biased ⟨β⟩ ≈ -2.8 (in the
faintest 1 mag bin of our β_in = -3 simulation). This creates
a complicated bias function, since the method returns a blue
biased ⟨β⟩ ≈ -2.4 when β_in = -2 (in the same bin). The
artificial tightening of the scatter is also severe: in the same
bin, σ(β_BC03) = 0.8 or 0.5 for β_in = -2 and -3 respectively.
For comparison, the faintest 1 mag bin of the β_in = -2 and
-3 simulations yield ⟨β_YJH⟩ = -2.3 ± 0.9 and -3.0 ± 0.8,
respectively. Fundamentally, given that ⟨β⟩ appears to evolve
only mildly with increasing redshift such that the intrinsic
β distribution likely has an average of ⟨β⟩ > -2.5, truncating
one side of the colour scatter will clearly yield
unrepresentative measurements of ⟨β⟩ and σ(β). This is not
to discount the use of the best-fitting model method outright:
Finkelstein et al. (2012b) have shown strong support
for the method in similar source recovery simulations. They
inserted a population of objects whose distribution of stellar
population parameters matched those observed in their
HUDF sample. For the average galaxy in that population,
and particularly at z < 7, Finkelstein et al. (2012b) found
that β was recovered most successfully from the best-fitting
model; a result we reproduce in that the scatter is minimal
in β_BC03. Finkelstein et al. (2012b) acknowledge that
the method would break down were the parameter space
edge to be reached, and the crux of our argument against
this method is that the faint, blue, z ≈ 7 galaxies we simulate
here exceed that limit (due to the impact of noise). For
galaxies detected with high significance, the method performs
well and there is no evidence that their colours are
not reproducible by the stellar population synthesis models
in an SED fit. A power-law fit to YJH photometry, attenuated
with a Lyman break as prescribed by the photometric
redshift, yields UV slopes closer to their intrinsic values than
does the J - H colour, yet without the bias introduced by
the assumption that the observed colours of low signal-to-noise
objects should be reproducible by stellar population
models.
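A Lyman-break-truncated power-law fit of the kind described above can be sketched in a few lines. The version below is only an illustration under stated assumptions: the filters are approximated as top-hats with rough, assumed edges (not the real WFC3/IR throughput curves), band fluxes are simple wavelength averages of f_ν, and the grid search and function names are hypothetical choices rather than the fitting code used in this work.

```python
import math

LYA = 1216.0  # rest-frame wavelength of the Lyman break adopted in the text (Angstrom)

# Top-hat approximations to the filter edges in Angstrom. These are rough,
# assumed values for illustration, not the real WFC3/IR throughput curves.
FILTERS = {"Y105": (9000.0, 12000.0), "J125": (11000.0, 14000.0), "H160": (14000.0, 17000.0)}

def band_flux(beta, z, lo, hi, n=200):
    """Band-averaged f_nu for f_lambda ~ lambda^beta, set to zero blueward of the break."""
    break_wave = (1.0 + z) * LYA
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        lam = lo + (i + 0.5) * step
        if lam >= break_wave:
            total += (lam / 15000.0) ** (beta + 2.0)  # f_nu ~ lambda^(beta + 2)
    return total / n

def fit_beta(fluxes, errors, z):
    """Grid search over beta; the normalisation is solved analytically at each step."""
    best_beta, best_chi2 = None, float("inf")
    for k in range(701):
        beta = -6.0 + 0.01 * k  # grid from -6 to +1
        model = {b: band_flux(beta, z, *FILTERS[b]) for b in fluxes}
        num = sum(fluxes[b] * model[b] / errors[b] ** 2 for b in fluxes)
        den = sum(model[b] ** 2 / errors[b] ** 2 for b in fluxes)
        amp = num / den  # least-squares amplitude for this beta
        chi2 = sum(((fluxes[b] - amp * model[b]) / errors[b]) ** 2 for b in fluxes)
        if chi2 < best_chi2:
            best_beta, best_chi2 = beta, chi2
    return best_beta

# Noise-free check: photometry generated with beta = -2.5 at z = 7.
truth = {b: band_flux(-2.5, 7.0, *FILTERS[b]) for b in FILTERS}
errs = {b: 0.05 for b in FILTERS}
beta_fit = fit_beta(truth, errs, 7.0)
print(f"recovered beta = {beta_fit:.2f}")
```

Because the grid is not bounded by any stellar population model, slopes bluer than β = -3 remain accessible to the fit, which is the property the argument above relies on.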

The complexity of the bias function for the best-fitting
model method (in that it is dependent on the intrinsic β) is
compounded by the reliance on accurate photometric redshifts.
This reliance is shared by the power-law method.
Where the colours of a galaxy are reproducible by models in
the photometric redshift model set, redshift recovery is generally
good: |z_phot - z_in| ≤ 0.1. However, even with perfect
HUDF09-like photometry of a galaxy with β_in = -4 (pure
power law) at z_in = 7, a photometric redshift of 6.86 is obtained
using BC03 models - the redshift is underestimated
in order to account for the "excess" flux in the Y-band. Fortunately,
the power-law method is reasonably robust to this:
adopting z = 6.86, a YJH power-law fit for β then yields
β ~ -3.8 - the value of β being tempered slightly toward
the colours of the model in the photometric redshift fit. This
bias is clearly still smaller than that seen when β is measured
directly from the best-fitting model.

Figure 8. Comparison of the three β measurement methods, described in Section 2, for simulated objects in the ERS, HUDF09 (epochs
1 and 2) and HUDF12 datasets. For each method (differentiated by colour of symbol), objects' UV slopes are plotted as a function of
detection-band magnitude (J125, AB mag, corrected to total flux). Within each simulation, all objects have a single β_intrinsic as shown
in the top-right corner of each panel. All objects with photometric redshift 6.5 ≤ z ≤ 7.5 are included, regardless of the robustness of
the redshift fit. The resulting β values as measured from the J - H colour only or a fit to the best-fitting BC03 model's rest-frame UV
continuum are shown by red and blue dots respectively. Green dots show β measured using our preferred method: a pure power-law fit
to the YJH photometry (Y, J125, J140, H in HUDF12), attenuated with a Lyman break cutoff to the power-law at (1 + z_phot) × 1216 Å.



6 MEASUREMENTS OF β FOR EXISTING
z ≈ 7 GALAXY CANDIDATES

Our simulated observations show that, for both the HUDF09
and ERS datasets, a power-law fit to YJH photometry provides
a more reliable measurement of the population's UV
slopes than a single J - H colour. We have therefore re-analysed
the photometry of the z ~ 7 sample of HUDF09
and ERS galaxies, provided by Dunlop et al. (2012), using a
YJH power-law fit. McLure et al. (2011) provide a detailed
description of the photometric redshift selection of a similar
sample; the Dunlop et al. (2012) sample we use here is
more inclusive in that it includes all high-redshift galaxy
candidates with both UNCLEAR and ROBUST photometric
redshifts. In line with the rest of this work, these catalogues
have been pruned of any objects with J125-band photometry
fainter than the 5σ limit. In addition, we have updated the
photometry of the Dunlop et al. (2012) HUDF09E1 sample
using the full-depth HUDF09FULL dataset so that we
can compare ERS, HUDF09E1 and HUDF09FULL z ≈ 7
catalogues to our simulations as shown in Fig. 9. It is immediately
clear that within each dataset there is a trend
toward blue βs at faint magnitudes. This is true not only
for the ROBUST objects, but for all. However, this trend is
mirrored by the β = -2 simulation (green and red dots in
the figure). In fact, the faint bin of the HUDF09E1 sample
averages ⟨β⟩ = -2.6 ± 0.2, which is only marginally bluer
than ⟨β_sim⟩ ~ -2.4, the biased measurement reached by
our intrinsically flat-spectrum simulation in the same luminosity
bin (see also Fig. 6). Moreover, as shown in the
right-hand panel of Fig. 9, the higher signal-to-noise delivered
by the complete HUDF09FULL dataset yields redder
values (⟨β⟩ = -2.3 for the faintest luminosity bin) as expected
if ⟨β⟩ is significantly biased by photometric scatter.
Thus, while some of the HUDF's brightest z ~ 7 galaxies -
which have ROBUST photometric redshifts - appear redder
than β = -2, there is currently no convincing evidence that
the faintest objects are significantly bluer than that.
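The averages quoted for each dataset in Fig. 9 are inverse-variance weighted means with a standard error. A minimal sketch of that statistic follows; the input values are hypothetical numbers chosen for illustration and are not measurements from the Dunlop et al. (2012) sample.

```python
import math

def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / e ** 2 for e in errors]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    std_err = math.sqrt(1.0 / sum(weights))
    return mean, std_err

# Hypothetical beta measurements and 1-sigma errors, not taken from the paper.
betas = [-2.9, -2.1, -2.6, -1.8]
sigmas = [0.8, 0.5, 1.0, 0.4]
mean_beta, err = weighted_mean(betas, sigmas)
print(f"<beta> = {mean_beta:.2f} +/- {err:.2f}")
```

Well-measured (typically brighter) objects dominate such a mean, which is one reason the binned running means in the lower panels of Fig. 9 are the more informative guide at the faint end.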

It is of course possible to use a suite of simulations to
determine the intrinsic distribution most likely present in

Figure 9. HUDF09E1 and ERS 6.5 ≤ z < 7.5 galaxies from the Dunlop et al. (2012) sample and our β_in = -2 simulations. In the
upper panels, the data are shown as solid (ROBUST photometric redshift) and open (UNCLEAR photometric redshift) squares. ROBUST
and UNCLEAR simulated sources are shown by green and red dots, respectively. The right panel shows the sample selected from the
HUDF09E1, but with the objects' photometry updated using the HUDF09FULL dataset. β is measured using a Lyman break truncated
power-law fit to the available YJH photometry. The lower panels show running means, ⟨β⟩ ± Std. Err., for the simulations (red regions)
and the faint (well sampled) end of the data binned by magnitude (black squares). In the lower panels, ALL objects are included. A clear
trend is seen for faint, ROBUST objects to have blue UV slopes; all faint, red objects are assigned UNCLEAR photometric redshifts. At the
faint end, this trend is reproduced by our β = -2 simulation, although the scatter is such that the faintest objects are consistent with
a β = -3 simulation (not shown for clarity, but see Fig. 8). Galaxies in the ERS are slightly redder than a β = -2 simulation would
predict, as are the brightest galaxies in the HUDF. For each dataset, the inverse-variance weighted mean ⟨β⟩ ± 1 standard error is given
both for ALL sources, and for the ROBUST sub-sample which is consistently slightly bluer. At the faint end (J125 ≥ 28 in the HUDF09, 27
in the ERS), including UNCLEAR sources is more significant. For example, their inclusion reddens the faint HUDF09FULL ⟨β⟩ from -2.7
to -2.3.



the observed galaxy sample. In a future paper we intend to
investigate the constraints which can be placed on the intrinsic
β distribution at z ~ 7, armed with the new HUDF12
dataset. The new HUDF12 data will also allow us to determine
the slope of any colour-magnitude relation present at
z ~ 7. There has been gradual convergence toward quantifying
this relation at 3 < z < 6 (e.g. Finkelstein et al. 2012b;
Bouwens et al. 2012) but, as this work has shown, this remains
unclear at z > 7. The bias toward blue β values, which
as we have seen is present in a variety of selection methods,
is a function of SNR. Thus any attempt at constraining a
colour-magnitude relation over a wide magnitude baseline,
e.g. using HUDF + ERS + CANDELS data, requires a careful
field-by-field approach to the bias correction.



7 STRATEGIES FOR THE HUDF12 

The forthcoming HUDF12 programme (GO 12498) will provide
significantly improved photometry of high-redshift
galaxies in three complementary ways. First, the depth of
the Y105 band will be increased to a detection limit of
30 AB mag (5σ, 0.4-arcsec diameter aperture; 29.6 AB in
a 0.6-arcsec diameter aperture) providing robust photometric
redshifts of 'Y-drop' galaxies at z > 8. Second, imaging
through the additional J140 filter will be added, reaching
a depth equalling that of the current J125-band data in
the HUDF09FULL (see Table 1). Finally, the H160 imaging
will be increased in depth to match that achieved in
J140 and J125, allowing more secure colour measurements
and minimizing bias in source selection. This will allow
Y105, J125, J140, H160 (hereafter YJJH) photometry to be
used for fitting the UV SED of galaxies at z ≈ 7, as we
have simulated in this work. A more selection-independent
measurement of β will also be possible by detecting objects
in the J140-band, with J125 - H160 being used as the colour
measurement. Here, we outline the benefits of these two new
measurements of β at z ≈ 7, each of which will only be possible
following the HUDF12 observations.

7.1 J - H colours of J140-selected galaxies

As discussed in Section 2, a single colour measurement in
J125 - H160 provides a simple estimate of β at z ≈ 7. While
we have seen that selecting galaxies (via SExtractor) in
the J125-band preferentially selects blue galaxies, yielding
biased ⟨β⟩ values, this can be avoided in the HUDF12 by
selecting in the J140-band. This method will alleviate some
of the 'flux-boosting' induced blue bias that is found when
J125 is used both to detect and determine the colour of z ≈ 7
galaxies. To quantify the expected benefit of this approach,
galaxies from our HUDF12 β_in = -2 simulation have been
independently selected in both the J125- and J140-bands.
Photometric redshift selection of z ≈ 7 galaxies is performed
using all bands for both of these catalogues. In Fig. 10, the
average UV slope ⟨β⟩ is shown as a function of selection band
magnitude for each catalogue. Measurement of ⟨β⟩ for bright
galaxies is not affected by the choice of selection band, but
within 1 mag of the 5σ image depth a J140-selected catalogue
clearly provides a less blue-biased measure of ⟨β⟩ than
a J125-selected catalogue. An inclusive J140-selected photometric
redshift catalogue allows an essentially unbiased measurement
of ⟨β⟩, and J140-selection somewhat reduces the
bias in a ROBUST photometric redshift catalogue. Reassuringly,
an unbiased measurement of ⟨β⟩ is possible without
resorting to the artificial neglect of certain bands from the
photometric redshift analysis as was suggested by Bouwens
et al. (2012).
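The benefit of detecting in a band that does not enter the colour can be reproduced with a toy Monte Carlo. The sketch below is not the HUDF12 simulation itself: it assumes equal Gaussian noise in all three bands, a simple SNR ≥ 5 detection threshold with no photometric redshift step, approximate pivot wavelengths, and hypothetical function names. Every source has a flat f_ν spectrum (β = -2) with a true SNR of 4 in each band.

```python
import math
import random

# Approximate pivot wavelengths in microns (assumed values).
LAMBDA_J125, LAMBDA_H160 = 1.25, 1.54

def beta_from_colour(f_j125, f_h160):
    """UV slope implied by a J125 - H160 colour, for f_lambda ~ lambda^beta."""
    colour = -2.5 * math.log10(f_j125 / f_h160)
    return colour / (2.5 * math.log10(LAMBDA_H160 / LAMBDA_J125)) - 2.0

def mean_beta_by_selection(true_snr=4.0, threshold=5.0, n=100000, seed=2):
    """Mean beta of a beta = -2 population selected in J125 or in J140."""
    rng = random.Random(seed)
    sel_j125, sel_j140 = [], []
    for _ in range(n):
        # Noisy fluxes in units of the 1-sigma noise of each band.
        j125, j140, h160 = (rng.gauss(true_snr, 1.0) for _ in range(3))
        if j125 <= 0.0 or h160 <= 0.0:
            continue  # colour undefined for non-positive flux
        beta = beta_from_colour(j125, h160)
        if j125 >= threshold:
            sel_j125.append(beta)  # detection band also sets the colour
        if j140 >= threshold:
            sel_j140.append(beta)  # detection band independent of the colour
    return sum(sel_j125) / len(sel_j125), sum(sel_j140) / len(sel_j140)

mean_j125_sel, mean_j140_sel = mean_beta_by_selection()
print(f"J125-selected: <beta> = {mean_j125_sel:.2f}")
print(f"J140-selected: <beta> = {mean_j140_sel:.2f}")
```

In this idealised case the J140-selected mean stays close to the input β = -2, whereas the J125-selected mean is strongly blue, because the same noise fluctuation that lifts a source over the detection threshold also enters its colour.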

7.2 Power-law β measurements

We have seen that a J140-selected catalogue, with β measured
via the independent J125 - H160 colour, is less blue biased
than a J125-selected catalogue. This is also true when β
is measured via a power-law fit to YJJH, but for more subtle
reasons. The primary cause of bias in β is flux boosting of
faint objects to just above the detection threshold. Galaxies
boosted in J125 are bound to be measured blue: the colour is
always blue relative to both J140 and H160.³ However, galaxies
boosted in J140 hold a blue J140 - H160 colour but a red
J125 - J140 colour of similar SNR. Thus, power-law β measurements
benefit from a J140-selection to a similar degree
as the β measurements obtained via the J - H colour. The
proposed addition of J140 photometry in the HUDF12 also
makes possible a multi-band power-law fit to the UV continuum
neglecting the Y105-band in which the Lyman break
falls. However, with the Y-band offering the greatest depth
in the HUDF12, it is not immediately obvious whether its
exclusion from the β measurement will be beneficial or not.

3 The Y-band photometry carries lower weight, being partially
attenuated by the Lyman break.



Using the J140-selected catalogues described in Section 7.1, β has
been measured, separately, using truncated power-law fits
to JJH and YJJH photometry. From the results shown in
Fig. 11, we can see that the inclusion of the Lyman-break
affected Y-band photometry does not appreciably reduce the
bias in the average UV slope at faint magnitudes. While the
inclusion of the Y-band greatly benefited the measurement
of β in the HUDF09, it can be excluded in the HUDF12 with
only the modest cost of an increase in the scatter of β for the
faintest galaxies. This is beneficial as a robust measurement
of ⟨β⟩ can then be obtained, via JJH photometry, without
relying on the Lyman break affected Y-band.

7.3 Lyman-α emitter contamination

In the HUDF12 simulations, we have seen that the inclusion
of Y-band photometry only mildly improves the measurement
of β in LBGs with no Lyman-α emission. However, the
Y-band at z ≈ 7 probes the Lyman-α line and the comparison
of JJH to YJJH fits may be a useful check for the presence
of Lyman-α emission. Furthermore, the existing β_YJH
measurements for HUDF09 galaxies could potentially be affected
by Lyman-α emission; excluding the Y-band photometry
for these fits would reduce the measurement to only a
single J - H colour.

We have performed simple simulations of the effect
of Lyman-α emitters (LAEs) on recovered β values as follows.
First, pure power-law spectra were created with β_in =
{-3, -2, -1}, truncated blueward of (1 + z) × 1216 Å. By
integrating under the rest-frame UV continuum in the range
1216 Å to (1216 + EW) Å, the Lyman-α line flux was calculated
and added to the flux at 1216 Å. Fig. 12 shows the
impact of Lyman-α emission of various equivalent widths
on the power-law derived β value from Y105, J125, J140, H160
photometry (as is present for the HUDF12 - the effect is
marginally stronger in the HUDF09 without J140 imaging).
Based on observations out to z ≈ 6, Stark, Ellis,
& Ouchi (2011) predict, in their faint luminosity bin
(26.7 < J125 < 28.2), a z = 7 Lyman-α EW distribution
peaked in the 25 < EW < 55 Å bin. Galaxies with
EW > 85 Å represent ~ 5% of the population. Thus we
can expect a bias of Δβ ≈ -0.5, if the LAE fraction does
not tail off at z ≈ 7. (However, Bolton & Haehnelt (2012)
have shown that only a small (~10%) neutral fraction in the
IGM may be sufficient to significantly reduce the transmission
of Lyman-α, and therefore the typical EW of Lyman-α
at z ≈ 7.)

This bias is maximised by not floating the redshift here.
In practice, the photometric redshift code will select a lower
redshift, thereby accounting for excess Y-band flux by a
lower-wavelength Lyman break. At z = 7, 50 Å EW Lyman-α
can be countered by misplacing the redshift by Δz ≈ -0.3.
Moreover, the inclusion of Lyman-α emission in photometric
redshift fits is commonplace. In principle this allows z, β
(continuum) and the EW of Lyman-α to be correctly determined.
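The size of the compensating redshift shift follows from a simple argument: a line of rest-frame equivalent width EW adds as much flux as EW × (1 + z) Å of observed-frame continuum, and moving the break blueward by Δz exposes 1216 × |Δz| Å of extra continuum. The sketch below equates the two, assuming the continuum is locally flat in f_λ across that interval (an approximation introduced here); the function names are hypothetical.

```python
LYA = 1216.0  # rest-frame Lyman-alpha wavelength in Angstrom

def observed_ew(ew_rest, z):
    """Observed-frame equivalent width in Angstrom."""
    return ew_rest * (1.0 + z)

def compensating_redshift_shift(ew_rest, z):
    """Redshift offset that uncovers as much continuum as the line adds.

    Assumes the continuum is locally flat in f_lambda near the break.
    """
    # Shifting the break blueward by LYA * |dz| exposes that much extra continuum.
    return -observed_ew(ew_rest, z) / LYA

dz = compensating_redshift_shift(50.0, 7.0)  # 50 A rest-frame EW at z = 7
print(f"delta z = {dz:.2f}")
```

For a 50 Å line at z = 7 this gives an observed equivalent width of 400 Å and a shift of about -0.33, consistent with the Δz ≈ -0.3 quoted above.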

In fact, for our sample of HUDF09 and ERS objects,
including the Y-band in the measurement of β typically returns
colours redder than when using only β_J-H. Thus any
Lyman-α present in those galaxies is not boosting the Y-band
flux to the extent that β_YJH is measured with a blue
bias. We can therefore conclude that either no high EW

Figure 10. Comparison of the bias in a simulated sample
of z ≈ 7 galaxies in the HUDF12, selected in either the J125
(upper panel) or J140 (lower panel) imaging. UV slopes, measured
via J125 - H160 colours, are shown for galaxies in our
HUDF12 β_in = -2 simulation. Filled and hollow circles mark
objects with ROBUST photometric redshifts and UNCLEAR objects
respectively. Average UV slope values ⟨β⟩, in bins of selection
band magnitude, are likewise shown by solid (ROBUST) and hollow
(ALL = ROBUST + UNCLEAR) lines. A catalogue produced by selecting
objects in the J140-band (lower panel) shows a less blue-biased
⟨β⟩ for faint galaxies than does a J125-band selected catalogue
(upper panel). This is due to selection band flux boosting in the
J125-selected catalogue fostering a sub-sample of sources which
is very blue in J - H (see Sections 7.1 and 7.2 for discussion).
The hollow line in the lower panel shows that a J140-selected
photometric redshift catalogue including all objects (ROBUST and
UNCLEAR) allows an unbiased average UV slope to be measured
to low SNR (< 8σ).

Figure 11. Comparison of the β bias in a simulated sample of
z ~ 7 galaxies in the HUDF12, with β measured by a Lyman
break truncated power-law fit to Y105, J125, J140, H160 (upper
panel) or only J125, J140, H160 (lower panel). Filled and hollow
circles mark objects with ROBUST photometric redshifts and UNCLEAR
objects respectively. Average UV slope values ⟨β⟩, in bins
of selection band magnitude, are likewise shown by solid (ROBUST)
and hollow (ALL = ROBUST + UNCLEAR) lines. For some objects, the
blueward scattering of the J - H colour is tempered by the inclusion
of the Y - J colour, although this primarily reduces the
width of the scatter and does little to alter ⟨β⟩. On an average basis,
there is therefore little benefit to including the Lyman break
affected Y-band photometry in the fit.



Lyman-α is present, or that the low contribution of Lyman-α
is readily countered by an underestimated photometric
redshift. Finally, as we have seen, the addition of J140 in the
HUDF12 programme will render the Y-band photometry
unnecessary in fitting β at z ≈ 7, alleviating this problem
in future analyses of HUDF data.

Figure 12. The effect of Lyman-α emission of various equivalent
widths on the recovery of the UV slope β. Perfect power-law spectra,
with β = {-1, -2, -3}, were created and the flux at 1216 Å
boosted to include an emission line of the specified equivalent
width. Thick red, green and blue lines denote the recovered βs at
z = 7 for each intrinsic β respectively; thin lines (upper/lower)
at z = 6.5/7.5. Here we assume perfect redshift and photometric
recovery in the HUDF12's Y105, J125, J140 and H160 bands.



8 CONCLUSIONS 

We have performed object recovery simulations of z ~ 7
objects in the HUDF and ERS fields, considering how the
choice of selection function and β measurement method affect
the measured average UV slope ⟨β⟩.

(i) A robust measurement of the UV slope β in the ERS
and HUDF09 datasets is obtained when fitting a Lyman
break truncated power-law SED to YJH photometry. In
simulations, this method minimizes the scattering of β away
from the intrinsic value. The method performs similarly to a
method advocated by Finkelstein et al. (2012b), but avoids
the parameter space issues associated with that method -
whereby the scatter in β is artificially reduced for ultra-faint,
blue, z ~ 7 objects - and outperforms the use of a
single J - H colour.

(ii) Our preferred method for measuring β relies on a precise
photometric redshift measurement, and thus can only be
applied using a full photometric redshift analysis. As such,
and in contrast to claims elsewhere, we have verified that
the total bias on the measurement of β is similar for a full
photometric redshift selection and for a colour-colour selection.

(iii) In doing so we have highlighted the sensitivity of 
recovered UV slope measurements to selection function 
choices. In particular, a comparison of the colour-colour se- 
lection functions of Bouwens et al. (2010b) and Bouwens 
et al. (2012) goes some way to explaining the difference in 
the average UV slope for faint z ~ 7 galaxies reported in 
those studies. Furthermore, we reiterate that excess bias in 



a photometric redshift selection function is only seen when 
optional criteria are added to robustly reject potential low- 
redshift interlopers. 

(iv) Using our preferred method, new UV slope measurements
for a sample of z ~ 7 galaxy candidates (Dunlop et al.
2012) have been made. The apparent colour-magnitude relation
- whereby the faintest objects appear bluest - is well
reproduced by a simulation in which the intrinsic UV colours
of objects are flat-spectrum (β = -2). Thus we find even
faint z ≈ 7 objects are able to have their colours reproduced
by stellar population models of normal star-forming galaxies
- requiring neither extremely young ages nor exotically low
metallicities.

(v) We have investigated strategies for minimizing the
bias in the measurement of β from the HUDF12 dataset,
finding that using the new J140-band imaging for detection,
in combination with our preferred β measurement method,
will yield results with significantly smaller biases than previous
estimates.

In a future paper including HUDF12 data, we aim to esti- 
mate the intrinsic distribution which underlies the observed 
UV colour distribution.